Abstract 

We outline two alternate approaches to predicting the 
onset of congestion in a packet switching satellite, and argue 
that predictive, rather than reactive, flow control is 
necessary for the efficient operation of such a system. The 
first method discussed is based on standard, statistical 
techniques which are used to periodically calculate a 
probability of near-term congestion based on arrival rate 
statistics. If this probability exceeds a preset threshold, the 
satellite would transmit a rate-reduction signal to all active 
ground stations. The second method discussed would utilize 
a neural network to periodically predict the occurrence of 
buffer overflow based on input data which would include, 
in addition to arrival rates, the distributions of packet 
lengths, source addresses and destination addresses. 

I. Motivation 

Consider a node in a packet switched network 
characterized by m inbound links and n outbound links. 
Inbound packets are routed over a switching fabric to their 
appropriate outbound links based on their individual 
destination addresses. We begin by making the general 
assumptions that packet lengths are variable and that we 
have no a priori information concerning the statistical nature 
of the packet routing patterns. 

Let $C_i$ represent the temporal capacity of link i. At any 
instant in time t, a link carries an amount of traffic equal to 
some fraction $\nu_i$ of its total capacity. In what follows, we 
define local buffer congestion to be the local condition 
obtaining when the totality of inbound traffic destined for a 
particular outbound link exceeds the capacity of that link. 
More formally, local congestion occurs when 

$$\sum_{i=1}^{m} \nu_i^{j} C_i > C_j \qquad (1)$$

where $\nu_i^{j}$ is the fraction of inbound traffic on link i destined 
for transmission on outbound link j. 
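
Condition (1) can be checked directly once the per-link fractions are known. The following is a minimal sketch under that assumption; the function name `is_locally_congested` and the sample capacities are illustrative and do not come from the system described here.

```python
def is_locally_congested(fractions, inbound_caps, outbound_cap):
    """Return True when condition (1) holds for one outbound link j.

    fractions[i]    -- assumed share of inbound link i bound for link j
    inbound_caps[i] -- capacity C_i of inbound link i
    outbound_cap    -- capacity C_j of the outbound link
    """
    if len(fractions) != len(inbound_caps):
        raise ValueError("one fraction is required per inbound link")
    # Total traffic offered to outbound link j, summed over all inbound links.
    offered = sum(v * c for v, c in zip(fractions, inbound_caps))
    return offered > outbound_cap


# Hypothetical example: three uplinks of capacity 10 feeding a downlink of capacity 8.
heavy = is_locally_congested([0.5, 0.25, 0.25], [10.0, 10.0, 10.0], 8.0)
light = is_locally_congested([0.25, 0.25, 0.25], [10.0, 10.0, 10.0], 8.0)
print(heavy, light)
```

In practice the fractions are not known in advance, which is why the remainder of the paper works with measured arrival rates at each downlink queue instead.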

The impact of local congestion on the performance of the 
network depends on its magnitude and on the length of time 
the condition persists. If we assume that the node contains 
some quantity of buffer memory for each outbound link, 
then the onset of congestion causes the buffer to fill to 
capacity, whereupon newly arriving packets are blocked. 
Thus, if blocking is used as a figure of merit, the length of 
time a node can sustain local congestion is proportional to 
the amount of buffer memory it supports and the local input 
rate. Considerable research has been carried out on 
optimizing the amount of memory built into packet- 
switching nodes given specific network operating protocols 
and performance specifications (cf. [1]). 

Because of the stochastic nature of the inbound traffic, 
usually with respect to packet arrival rates, service rates, 
and destination routing requirements, overflows will occur 
with some finite probability. Two basic approaches to the 
problem are (1) to estimate the maximum duration of such 
events and provide sufficient buffer to minimize packet 
loss, and (2) to use flow control techniques to minimize 
the probability of overflows. Typical network architectures 
incorporate 
various combinations of these (and other) methods to 
mitigate the effects of congestion. Many packet-oriented 
communications protocols require the retransmission of 
blocked packets, which can lead to the global condition of 
network congestion, wherein the average packet throughput 
delay becomes intolerably large. 

It is difficult to efficiently adapt flow control schemes 
designed for low- to moderate-rate terrestrial networks to 
links characterized by high bandwidth-delay products, such 
as high-speed fiber optic links or geostationary satellite 
links. In particular, the advent of commercial 
communication satellites capable of on-board packet routing 
poses serious challenges in efficiently processing packet 
traffic routed through a satellite with the objective of 
minimizing both packet loss and throughput delay. 

We consider the former to be the more significant 
problem. Common terrestrial protocols recognize the onset 
of congestion after the fact through detection of anomalously 
high throughput delay and/or traffic blocking. The volume 
of packets lost before this congestion is brought under 
control, due to the relatively short length of terrestrial links, 
is quite small relative to that which would be lost on the 
much longer geostationary Earth-space link. Moreover, 
retransmission of lost packets from ground stations to the 
satellite, again due to the inherently large propagation delay, 
can clearly have a severe impact on end-to-end performance 
(extremely severe in, for example, the case of go-back-n 
ARQ protocols [2]). On the other hand, the in situ 
queueing and transmission delay suffered by a packet as a 
consequence of onboard processing operations would be 
relatively inconsequential compared to the unavoidable 
propagation delay experienced by the packet. Thus, in this 
work, we consider the main problem to be that of 
preventing local congestion (onboard buffer overflow) and 
subsequent packet blocking and loss by attempting to predict 
the onset of such a condition prior to its occurrence and 
controlling it through a rate-based, feedback control 
mechanism. 

It should be noted that we do not address the problem of 
admission control to the network, which is usually a 
function of the call setup procedure. In this work, we deal 
with those sessions which have already been admitted. We 
operate on the assumption that the true packet rate and 
routing distributions of an individual user requesting access 
are unknown at the time of call setup, thus the user traffic 
characteristics may change during the course of the session, 
possibly resulting in congestion. 

We outline two approaches to controlling the rate of 
inbound (uplink) traffic destined for a particular outbound 
link (downlink) in a satellite system architecture supporting 
full, destination-directed packet routing and packet 
buffering. The first approach centers on statistical 
estimation and prediction, while the second is based on 
neural networks which capture finite time series behavior. 
Both approaches use time-history data on the behavior of the 
arrival rate to predict the near-term occurrence of an 
overflow, at which time a rate-control (or throttle) signal is 
transmitted by the satellite to each ground station. We 
begin by developing a conceptual model of the system 
architecture. 

II. A System Model 

The general description of the packet switching node 
given above can be extended and formalized. Our goal is 
a simple architectural model which is "reasonable" in the 
sense that the number of (possibly unrealistic) assumptions 
needed concerning its operation is minimal. 

As before, we assume that the architecture supports m 
inbound and n outbound links (the distinction is logical, not 
physical; that is, the satellite supports m uplink channels 
and n downlink channels). We make no assumption at this 
point concerning the satellite access method (e.g. TDMA, 
FDMA). The communications payload contains a 
switching fabric capable of routing individual packets from 
any uplink to any downlink. Each downlink channel i is 
preceded by an amount of store-and-forward buffer $b_i$ 
(which we will take as fixed, although we do not rule out 
dynamic allocation schemes). Placing the buffer memory 
between the routing switch and the downlink channel 
circuitry, though necessary to the specific formalism 
developed here, is not crucial to the basic approach. 

A key requirement of the control mechanisms to be 
developed is that the system architecture support a common 
signalling channel or priority broadcast mode by which low- 
rate flow control commands can be transmitted from the 
satellite and received simultaneously by all active ground 
stations independently of user traffic. 



Fig. 1 

Figure 1 schematically illustrates the system model in the 
context of a generic switching satellite architecture. 

We next turn to a fundamental characteristic of any flow 
control model, which is the assumed nature of the behavior 
of the traffic arrival streams. The traditional mathematical 
representation of packet arrivals, the Poisson process model, 
requires the assumptions that the interarrival times between 
successive packets are exponentially distributed and that the 
key parameters (such as packet length and destination) of 
successive packets are statistically independent. Clearly, 
these assumptions are of questionable value in many (or 
even most) realistic situations; however, there are occasions 
in which they are usually valid. Kleinrock [3], for example, 
demonstrated that multiplexing a sufficiently large number 
of packet streams arriving from independent sources results 
in a composite stream exhibiting Markovian behavior. 



Fig. 2 


The system model proposed can be viewed as a queueing 
system as illustrated in Figure 2. Here we represent the 
totality of uplink traffic as a single stream which is 
demultiplexed (in a sense) by the switching fabric. In other 
words, we assume that the switch acts to randomly select 
packets and route them to the downlink queues, thus 
resulting in n individual arrival streams exhibiting 
Markovian behavior (in other words, packets arriving from 
independent sources are multiplexed by the switching fabric 
and appear at the output of the switch in such a way that 
successive packets are no longer correlated). Depending on 
the actual architecture of the switching fabric, each 
downlink channel may well observe a packet stream which 
is (very nearly) governed by a Poisson distribution. 

The importance of such behavior is most evident in the 
construction of steady-state queueing models of network 
nodes, particularly in the case of the M/M/1 queue, in 
which packet arrivals form a Poisson process with rate 
parameter $\lambda$ and packet lengths (service times) are 
exponentially distributed. In the steady state, the expected 
value of the number of packets in the buffer is given by 

$$E(n) = \frac{\lambda/\mu}{1 - \lambda/\mu} \qquad (2)$$

where $\lambda$ is the average arrival rate and $1/\mu$ is the average 
packet length (expressed as a transmission time). Although 
queueing models have been used with much success to design 
communication networks, their obvious drawback is that one 
can typically deal only with steady-state, average behavior. 
In attempting to develop a control mechanism to circumvent 
congestion, we are, in fact, almost exclusively concerned with 
the transient behavior of the system (although significant work 
has been done on the transient behavior of queues, most 
tractable aspects of the theory are restricted to steady-state 
analyses, cf. [4]). 
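
To make the steady-state limitation concrete, the sketch below evaluates the standard M/M/1 mean occupancy $\rho/(1-\rho)$ with $\rho = \lambda/\mu$. The function name and the sample rates are illustrative assumptions; the formula itself is the textbook M/M/1 result.

```python
def mm1_mean_in_system(arrival_rate, service_rate):
    """Steady-state mean number of packets in an M/M/1 system.

    arrival_rate -- lambda, packets per unit time
    service_rate -- mu, packets per unit time (1/mu is the mean service time)
    """
    if arrival_rate < 0 or service_rate <= 0:
        raise ValueError("rates must be non-negative, with a positive service rate")
    rho = arrival_rate / service_rate  # utilization
    if rho >= 1.0:
        raise ValueError("no steady state exists when utilization is >= 1")
    return rho / (1.0 - rho)


# Hypothetical rates: the mean occupancy grows sharply as utilization nears 1.
for lam in (5.0, 9.0, 9.9):
    print(lam, mm1_mean_in_system(lam, 10.0))
```

The result is an average over an unbounded observation period, so it says nothing about how quickly a buffer fills after a sudden change in the arrival rate.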


A somewhat better method of dealing with traffic streams 
which do not readily lend themselves to steady-state 
treatment is to assume that the arrivals are governed by 
nonstationary probability distributions. In the simplest case 
of the stationary Poisson model, the probability that k 
packets will arrive within a time interval $\tau$ is given by the 
expression 

$$\Pr[K = k] = \frac{(\lambda\tau)^k e^{-\lambda\tau}}{k!} \qquad (3)$$

where $\lambda$ is the average packet arrival rate defined by 

$$\lambda \equiv \lim_{\tau \to \infty} \frac{\text{Exp. arrivals in } [0,\tau]}{\tau} \qquad (4)$$

Pragmatically, $\lambda$ is assumed to exist and is generally taken 
to be a long-term average of arrival rates of an ergodic 
point process, and can vary significantly from shorter-term 
or rapid fluctuations (characterized by higher-order 
statistics). Longer-term fluctuations are characterized by 
more gradual changes in the value of the parameter of the 
governing probability distribution. 

If, on the other hand, the arrival process is governed by 
the nonstationary Poisson distribution (i.e. by the 
nonhomogeneous Poisson process), the behavior is taken to 
result from stochastic time-dependency of the average rate 
parameter, viz: $\lambda = \lambda(t)$. Notice that $\lambda(t)$ may itself be a 
stochastic process of indeterminate statistics. Figure 3 
illustrates a typical uncontrolled arrival stream subject to 
both types of behavior. Both short and long-term variations 
are clearly evident. Formally, the rate parameter $\lambda(t)$ is 
defined by 

$$\lambda(t) = \frac{d\Lambda(t)}{dt} \qquad (5)$$

where $\Lambda(t)$ is termed the expectation function and is assumed 
to be differentiable at an instant t. The probability of k 
arrivals occurring in a time interval $\tau = [t, t+s]$ is 
now 

$$\Pr[K = k] = \frac{[\Lambda(t+s) - \Lambda(t)]^k}{k!}\, e^{-[\Lambda(t+s) - \Lambda(t)]} \qquad (6)$$

where 

$$\Lambda(t+s) - \Lambda(t) = \int_t^{t+s} \lambda(u)\,du \qquad (7)$$

or, by substitution, 

$$\Pr[K = k] = \frac{1}{k!}\left[\int_t^{t+s} \lambda(u)\,du\right]^k \exp\left(-\int_t^{t+s} \lambda(u)\,du\right) \qquad (8)$$
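
As an illustration of the nonhomogeneous Poisson probability, the sketch below integrates a rate function numerically over [t, t+s] and evaluates the Poisson term for k arrivals. The names `integrated_rate` and `prob_k_arrivals`, the trapezoidal rule, and the example rate function are assumptions made for this sketch only.

```python
import math


def integrated_rate(rate_fn, t, s, steps=1000):
    """Trapezoidal estimate of the integral of rate_fn over [t, t + s]."""
    h = s / steps
    total = 0.5 * (rate_fn(t) + rate_fn(t + s))
    for i in range(1, steps):
        total += rate_fn(t + i * h)
    return total * h


def prob_k_arrivals(rate_fn, t, s, k, steps=1000):
    """Probability of exactly k arrivals in [t, t + s] for a
    nonhomogeneous Poisson process with rate function rate_fn."""
    mean = integrated_rate(rate_fn, t, s, steps)
    if mean <= 0.0:
        return 1.0 if k == 0 else 0.0
    # Evaluate in log space so large k or large means do not overflow.
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


# Hypothetical rate: a slow periodic trend around 50 packets per second.
def example_rate(u):
    return 50.0 + 20.0 * math.sin(0.1 * u)


print(prob_k_arrivals(example_rate, 10.0, 0.125, 8))
```

With a constant rate function the result reduces to the stationary Poisson case, which gives a convenient sanity check.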

The motivation here is simply that we hope to realize 
bounds on the short-term transients (which are governed by 
higher-order statistics), whereas changes in the average rate 
parameter $\lambda(t)$, though still of a stochastic nature, will be 
driven by a deterministic, underlying structure. Learning 
and adapting to this structure is the essence of both the 
statistical and neural network feedback control mechanisms 
proposed. By developing a mechanism which can learn and 
predict the relatively near-term behavior of $\lambda(t)$ and 
combining it with a state-variable representation of the 
average number of packets in queue at time t, we can 
attempt to predict the point at which congestion will occur 
and initiate a rate-reduction in the arrival stream. This 
approach is only one of many filtering and prediction 
methods available through classical stochastic control theory; 
we choose it for its apparent operational simplicity. We 
elaborate on this approach in Section III. 

We digress to note that it is possible to combine the 
time-varying arrival rate with the steady-state M/M/1 
queueing model to derive a state equation describing the 
time evolution of the queue length. Filipiak [5] (among 
others) matches the steady-state equilibrium point of the 
queueing model with that of the dynamic model to obtain the 
differential equation 

$$\frac{dx(t)}{dt} = -\mu\,\frac{x(t)}{1 + x(t)} + \lambda(t) \qquad (9)$$

which has the initial condition $x(0) = x_0$. This expression 
describes the evolution of the mean number in the queueing 
system over time, and does not, in general, account for 
short-term transients. However, if $\lambda(t)$ is taken as the 
ensemble average of the arrival process and it is known (or 
can be shown) that the variance of the process is relatively 
small, the state variable can be a fairly accurate predictor of 
the dynamics of the system. If this is the case, standard 
optimal control techniques (Hamilton-Jacobi) can be applied 
to the state evolution expression above to derive an optimal 
control to minimize the blocking probabilities [6]. We do 
not assume here that purely steady-state behavior will be 
sufficient to accurately predict the onset of congestion. 
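
The behavior of such a state model can be explored numerically. The sketch below assumes a fluid-flow state equation of the form dx/dt = -mu x/(1+x) + lambda(t), which is the form whose equilibrium coincides with the M/M/1 mean, and integrates it with a forward Euler step. The function name, step size, and rates are illustrative assumptions.

```python
def simulate_mean_queue(rate_fn, service_rate, x0, t_end, dt=0.01):
    """Forward Euler integration of dx/dt = -mu * x / (1 + x) + lambda(t).

    rate_fn      -- assumed arrival-rate function lambda(t)
    service_rate -- mu
    x0           -- mean number in the system at t = 0
    Returns the state x at time t_end.
    """
    x = x0
    t = 0.0
    for _ in range(int(round(t_end / dt))):
        dx = -service_rate * x / (1.0 + x) + rate_fn(t)
        x = max(0.0, x + dt * dx)  # a mean queue length cannot be negative
        t += dt
    return x


# With a constant rate, the state settles at the M/M/1 value rho / (1 - rho).
print(simulate_mean_queue(lambda t: 1.0, 2.0, 0.0, 40.0))
```

A time-varying `rate_fn` shows how the mean occupancy lags behind changes in the arrival rate, which is the effect a predictive controller has to anticipate.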

Our motivation for examining both statistical and neural 
network control mechanisms is that statistical methods are 
well-understood and have been extensively analyzed and 
used. Thus, even though stochastic formulations must 
necessarily rely on certain working assumptions (i.e. be 
"model-based"), we are at least able to determine when such 
assumptions are, or are not, valid. On the other hand, 
neural networks are not yet well understood; a significant 
amount of trial and error testing is usually required to 
develop functional prototypes and the underlying theory is 
far from comprehensive. Nonetheless, the attraction of 
neural networks here lies in their potential for model-free 
estimation and prediction, coupled with their ability to 
distinguish patterns and trends which cannot be adequately 
captured by probabilistic methods. We begin with a 
proposed technique based on statistical estimation and 
prediction. 


III. An Approach to Statistical Predictive Control 

The basic statistical predictive flow control mechanism 
we propose is simply as follows: at time t, predict the 
probability of an overflow at time t+s. If the predicted 
probability metric exceeds a predefined threshold, transmit 
a throttle command to all active ground stations to effect an 
immediate reduction in the number of uplink packets 
destined for that downlink buffer. 

We assume that the packet arrival rate is determined by 
counting circuitry at each output of the destination-directed 
switch (a sampling mechanism might also be used). It 
appears that, due to the digital nature of the system, the 
measured value of the packet arrival rate should be 
relatively free of measurement noise, obviating the need for 
statistical filtering at that point. If this is not the case, then 
optimal filtering techniques (such as the well-known Kalman 
filter) must be used. The measured arrival rates represent 
the composite transmission rate of all active ground stations 
to each particular downlink buffer approximately 125 ms in 
the past. The task of our proposed prediction mechanism is 
to calculate the probability of congestion within the near- 
term future (for a predefined window), compare that value 
with a preset threshold, and trigger the transmission of a 
throttle command if the threshold is exceeded. Figure 4 
illustrates the algorithmic flow. 

Our definition of congestion probability is based on the 
nonhomogeneous Poisson process model described above. 
As before, let $b_i$ be the total amount of buffer available at 
the downlink queue and let the amount of buffer in use at 
time t be $N_t$. The probability of congestion can be defined 
as the probability that more traffic will arrive at the queue 
in the interval [t,t+s] than can be accommodated, which is 
expressed by 

$$P_c(t,t+s) = \Pr[K > b_i - N_t] = 1 - \sum_{k=0}^{b_i - N_t} \frac{1}{k!}\left[\int_t^{t+s} \lambda(u)\,du\right]^k \exp\left(-\int_t^{t+s} \lambda(u)\,du\right) \qquad (10)$$

Note that, for simplicity, we take the conservative 
approach of neglecting departures from the buffer during the 
interval. Numerical computation of $P_c(t,t+s)$ is relatively 
straightforward. The problem, then, is in estimating and 
predicting the value of $\lambda(t)$ for the near-term window. 
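
The threshold test can be sketched as a Poisson tail computation followed by a comparison. The code below is a minimal illustration that assumes buffer space is counted in whole packets and that the expected number of arrivals over the window has already been estimated; the names `congestion_probability` and `should_throttle` and the sample values are hypothetical.

```python
import math


def congestion_probability(expected_arrivals, free_slots):
    """P[K > free_slots] for K ~ Poisson(expected_arrivals).

    Departures during the window are neglected, as in the text.
    Direct summation is adequate for moderate means; very large means
    would call for a log-space or normal approximation instead.
    """
    if expected_arrivals < 0 or free_slots < 0:
        raise ValueError("arguments must be non-negative")
    term = math.exp(-expected_arrivals)  # Pr[K = 0]
    cdf = term
    for k in range(1, free_slots + 1):
        term *= expected_arrivals / k    # Pr[K = k] from Pr[K = k - 1]
        cdf += term
    return max(0.0, 1.0 - cdf)


def should_throttle(expected_arrivals, buffer_size, in_use, threshold):
    """True when the predicted overflow probability exceeds the threshold."""
    free_slots = max(0, buffer_size - in_use)
    return congestion_probability(expected_arrivals, free_slots) > threshold


# Hypothetical values: 40 packets expected, 64-packet buffer, 30 already queued.
print(congestion_probability(40.0, 34), should_throttle(40.0, 64, 30, 0.05))
```

The threshold trades false alarms against missed overflows, so a single fixed value is unlikely to suit every traffic mix.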

The choice of an appropriate prediction scheme is, 
unfortunately, rather highly dependent on the dynamics of 
the arriving traffic stream. If the traffic pattern seen at the 
downlink queue is relatively homogeneous on, say, 
successive days (as is typically the case of interactive 
computer use, for example), then it may be convenient to 
average the arrival rates over several days and derive a 
smooth, polynomial approximation $\lambda^*(t)$ of $\lambda(t)$ using a 
technique such as maximum likelihood estimation. 

On the other hand, if the arrival process is characterized 
by large variations over the mean, then basing the estimate 
on average values would be insufficiently accurate for 
purposes of control. Alternate estimation procedures (of 
which there are many) must be chosen to take advantage of 
any deterministic structure underlying the arrival process. 
Possible choices include the Kalman filter, the Sage-Husa 
filter, LMS adaptive filters, and local linear predictive 
techniques (cf. [7], [8]). 


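For example, a simple linear predictor expresses the 
value of $\lambda^*(t)$ by linear extrapolation through two previous 
data points. Thus, 

$$\lambda^*(t) = \hat{\lambda}(T_2) + \frac{\hat{\lambda}(T_2) - \hat{\lambda}(T_1)}{T_2 - T_1}\,(t - T_2) \qquad (11)$$

where $\hat{\lambda}(T_i)$ is the measured value of $\lambda(t)$ at $T_i$, with 
$T_1 < T_2$ the two most recent measurement instants. A little 
elementary calculus gives us that 

$$\int_t^{t+s} \lambda^*(u)\,du = s\left[\lambda^*(t) + \frac{s}{2}\,\frac{\hat{\lambda}(T_2) - \hat{\lambda}(T_1)}{T_2 - T_1}\right] \qquad (12)$$

In this case, expression (10) can be computed quite simply 
either directly or using series approximations. 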
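
A two-point linear predictor and its integral over the prediction window can be sketched as follows. The function names, the argument order, and the clamping of a negative result are assumptions of this sketch, not part of the method as stated.

```python
def linear_rate_predictor(t1, rate1, t2, rate2):
    """Return (predictor, slope) for the line through two rate measurements.

    (t1, rate1) and (t2, rate2) are the two most recent measurements, t1 < t2.
    """
    if t2 <= t1:
        raise ValueError("measurement times must satisfy t1 < t2")
    slope = (rate2 - rate1) / (t2 - t1)

    def predicted_rate(t):
        return rate2 + slope * (t - t2)

    return predicted_rate, slope


def expected_arrivals(t1, rate1, t2, rate2, t, s):
    """Integral of the extrapolated rate over [t, t + s] in closed form."""
    predicted_rate, slope = linear_rate_predictor(t1, rate1, t2, rate2)
    mean = s * (predicted_rate(t) + 0.5 * slope * s)
    # A falling extrapolation can go negative; clamp at zero as a crude guard.
    return max(0.0, mean)


# Hypothetical measurements: 10 packets/s at t = 0 and 12 packets/s at t = 1.
print(expected_arrivals(0.0, 10.0, 1.0, 12.0, 1.0, 0.5))
```

The returned value is the Poisson mean for the window, so it can be fed directly into a congestion probability calculation.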


Although in certain well-behaved cases a linear predictor 
can perform quite well, typically higher-order predictors are 
necessary. In addition, it is probable that the higher-order 
statistics of the arrival stream may have to be taken into 
account by the prediction mechanism, which raises several 
difficulties in applying a classical probabilistic approach. 
The overall objective, of course, is to maximize the 
efficiency of the prediction mechanism; that is, to 
determine $P_c(t,t+s)$ such that it exceeds its threshold value 
largely when the actual arrival rate will result in an 
overflow, and rarely otherwise. 

Determination of an approximation function $\lambda^*(t)$ of $\lambda(t)$, 
though beyond the scope of this paper, is one of the key 
factors in the success of the proposed control mechanism. 
An important consideration here is the accuracy of the value 
of $\lambda^*(t)$ within the near-term window; in other words, how 
well does the estimation scheme capture the structure of 
short-term fluctuations? Ideally, we wish to minimize the 
probability of congestion due to these short term transients. 
One possible (and purely speculative) approach to this 
problem is to view the short-term transients as yet another 
stochastic process, governed by its own set of statistics. 
Thus one can envision two separate prediction mechanisms, 
one geared to predicting long-term behavior, the other to 
predicting short-term fluctuations. 

Several aspects of this scheme will require resolution. 
First, the time which must elapse between the transmission 
of a throttle command and its receipt by a ground station is 
approximately 125 milliseconds (during which time 125 
milliseconds worth of traffic is uplinked to the satellite), 
placing a bound on the minimum prediction window. 

Second, we assume that the data available to the 
prediction mechanism will consist of a finite amount of time 
series data concerning the arrival rate, distribution of packet 
lengths, sources and destinations. 

Third, the action carried out by a ground station upon 
receipt of a throttle command is taken to be a (possibly 
total) reduction in the transmission of packets destined for 
the affected queue (whereupon such packets are buffered at 
the ground station, resulting in backpressure along the 
terrestrial links to the attached nodes). We should note here 
also that a possible consequence of this backpressure is that 
upon release of the traffic, the hypothetical Poisson model 
may become a batch arrival process and must be treated 
differently. 

Fourth, the length of time rate reduction is executed at 
the ground stations must be resolved. Fifth, and finally, the 
optimal value of the preset probability threshold must be 
determined on an application-specific basis by trading off 
link efficiency and overall quality of service requirements. 

IV. Neural Network Predictive Control 

Time-series prediction using neural networks is a 
relatively new and active research area. Notable work in 
the prediction of highly nonlinear (and chaotic) time series 
has been performed by Jones et al. [9], among others. 

A 1988 report by Elman [10] describes a simple, 
partially recurrent neural network architecture designed to 
predict the successive elements in a sequence (i.e., which 
predicts the state of the sequence at time t+1 given the 
current state and the state at time t-1). The key feature of 
this architecture is the addition of a hidden layer of neurons 
(termed the context layer) which provides the network with 
memory. Although developed for application to language 
processing, the ability of the network to make predictions 
based on finite time series appears to constitute an 
appropriate basis for the control of packet arrival streams. 
Figure 5 schematically illustrates Elman's neural network 
architecture. 

The operation of the network is fairly straightforward. 
Ignoring the context layer for the moment, the input, 
hidden, and output layers would function as a simple, 
feedforward neural net. By copying the contents of the 
hidden layer back onto the context layer at each time step 
(say, at time t), the network stores the state of the hidden 
layer at time t. During the (t+1)st processing step, the 
context information is propagated back into the hidden layer, 
essentially "influencing" the current state with time-history 
data from time t. If the network is being trained (Elman 
utilizes the method of backpropagation), then the forward 
connection weights are adjusted in the standard manner. 
The recurrent connections are fixed at unity and are not 
affected by the training. 

Because the context layer is continually updated by 
copies of the current hidden layer, a time-history is 
maintained in the network. In particular, even though only 
a single context layer is used, historical events that are of 
large magnitude (in some sense) can continue to influence 
the hidden layer over more than a single time step. 
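
The copy-back mechanism can be illustrated with a forward pass only. The sketch below is an assumption-laden illustration: the class name `ElmanNetwork`, the layer sizes, the sigmoid activation, the random initial weights, and the three-element input vector are all hypothetical, and training by backpropagation is omitted entirely.

```python
import math
import random


class ElmanNetwork:
    """Forward pass of a partially recurrent (Elman-style) network."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)

        def matrix(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.w_in = matrix(n_hidden, n_in)       # input -> hidden (trainable)
        self.w_ctx = matrix(n_hidden, n_hidden)  # context -> hidden (trainable)
        self.w_out = matrix(n_out, n_hidden)     # hidden -> output (trainable)
        self.context = [0.0] * n_hidden          # previous hidden state

    @staticmethod
    def _sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    def reset(self):
        """Clear the stored time history."""
        self.context = [0.0] * len(self.context)

    def step(self, inputs):
        """Process one time step and return the output activations."""
        hidden = []
        for w_in_row, w_ctx_row in zip(self.w_in, self.w_ctx):
            z = sum(w * x for w, x in zip(w_in_row, inputs))
            z += sum(w * c for w, c in zip(w_ctx_row, self.context))
            hidden.append(self._sigmoid(z))
        outputs = [self._sigmoid(sum(w * h for w, h in zip(row, hidden)))
                   for row in self.w_out]
        # Copy-back: the hidden -> context link has a fixed weight of one.
        self.context = list(hidden)
        return outputs


# Hypothetical inputs: normalized arrival rate, packet length, destination share.
net = ElmanNetwork(n_in=3, n_hidden=4, n_out=1)
for sample in ([0.2, 0.5, 0.1], [0.6, 0.5, 0.3], [0.9, 0.4, 0.7]):
    print(net.step(sample))
```

Because the context is overwritten at every step, older inputs influence the output only through what they left behind in the hidden state.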

Thus, returning to the problem at hand, the neural net 
would serve as a predictor of overflows. A singular 
advantage to this approach over the statistical method is the 
ease of using not only arrival rates as a basis for prediction, 
but also such information-rich data as the distributions of 
packet lengths, source addresses, and destination addresses. 
In addition, this approach does not require the assumption 
of a Markovian arrival stream. Indeed, positive correlation 
between successive packets is in itself a source of data 
which could be used by an intelligent prediction mechanism; 
in fact, Elman's work in sequence analysis seeks just such 
correlative behavior. 

Operationally, we envision that the neural network would 
be trained with input vectors consisting of the arrival 
statistics given above and a binary output vector indicating 
whether or not congestion occurred within the specified 
prediction window. Typically, two neural networks would 
operate in tandem, with one undergoing online training 
while the other generates a binary signal corresponding to 
its prediction of either an overflow condition or nominal 
operation. Output of a value indicating a predicted overflow 
condition would, as in the statistical prediction mechanism, 
result in transmission of a throttle signal to the active 
ground stations. 
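
The binary training targets described above can be derived from a recorded buffer occupancy trace. The sketch below assumes that congestion means the occupancy reaching the buffer size and that the prediction window is counted in sampling intervals; the function name and the sample trace are hypothetical.

```python
def overflow_labels(occupancy, buffer_size, window):
    """Binary targets for training a congestion predictor.

    labels[i] is 1 when the occupancy reaches buffer_size at any of the
    `window` samples that follow sample i, and 0 otherwise. Samples too
    close to the end of the trace to have a full window are left out.
    """
    if window < 1:
        raise ValueError("window must cover at least one sample")
    labels = []
    for i in range(len(occupancy) - window):
        future = occupancy[i + 1:i + 1 + window]
        labels.append(1 if any(q >= buffer_size for q in future) else 0)
    return labels


# Hypothetical occupancy trace for a 10-packet buffer, two-sample window.
trace = [1, 2, 3, 9, 10, 4, 2]
print(overflow_labels(trace, buffer_size=10, window=2))
```

Each label would be paired with the input vector observed at the same sample index, so the network learns to flag an overflow before it happens.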

A clear advantage to using neural networks for this 
approach is that one need not be concerned with formally 
discriminating between first and second order arrival 
statistics, or with the difficult problem of combining several 
distributions into a multivariate prediction model. Thus, it 
may be possible to devise a much more efficient predictor 
using the neural network's inherent capability to train on 
multivariate input data. 

Neural networks are by no means magical, however, and 
careful consideration must be given to the appropriate 
implementation of such a network for the specific 
application. As this paper is preliminary, we do not yet 
have a firm grasp of the precise network architecture that is 
required to predict congestion. It does appear, however, 
that Elman's architecture, through its ability to efficiently 
use time-history data, offers significant promise in the area 
of recursive estimation. 

V. Summary 

We outlined two alternate approaches to the predictive 
control of traffic arrival to a packet switching satellite. We 
believe such an approach to controlling congestion is 
necessary for the operational efficiency of such systems, 
since allowing the loss of earth-to-space packet traffic would 
result in serious performance degradation, perhaps causing 
the system performance to fall below the required quality of 
service. 

The first approach was based on standard statistical 
methods for dealing with traffic flows in computer and 
communication networks. A nonstationary distribution was 
used to derive a simple expression for the probability of a 
local overflow which could, in theory, be either periodically 
or continuously calculated by an onboard processor and 
compared to preset (and possibly dynamic) thresholds to 
initiate transmission of rate control signals to the ground 
station population. 

The second method suggests using a novel class of neural 
network developed by Elman to predict an overflow based 
on a number of available traffic arrival statistics, including 
the arrival rate and distributions of packet lengths, sources, 
and destinations. The proposed neural network architecture 
utilizes a recurrent context layer of neurons to capture 
recent past behavior of the arrival stream and incorporates 
this into its (feedforward) predictive output. 

Efforts are underway by the author to test the 
effectiveness of both methods through the tool of computer- 
based simulation. Of interest are not only the practicality of 
the methods, but also their robustness to dynamic changes 
in the intensity, time-variations, and statistics of the traffic 
flow. Preliminary results using the simple linear predictive 
non-homogeneous Poisson process for random arrivals based 
on quasi-deterministic structures (simple periodic functions 
with a uniformly-distributed random component) appear 
promising. Evaluation of the control mechanisms in a more 
realistic networking scenario has recently begun. The 
problem of an effective estimator for $\lambda^*(t)$ has not yet been 
addressed. 

Potential applications for this work include several 
projects underway at NASA Lewis Research Center (such 
as the Information Switching Processor and Autonomous 
Network Controller) as well as applications to envisioned 
future satellite services supporting packet and fast-packet 
routing, such as ISDN and B-ISDN. 